The Applications Of Unsupervised Learning To Japanese Grapheme-Phoneme Alignment

Authors

Timothy Baldwin

Hozumi Tanaka

Abstract

In this paper, we adapt the TF-IDF model to the Japanese grapheme-phoneme alignment task, by way of a simple statistical model and an incremental learning method. In the incremental learning method, grapheme-phoneme alignment paradigms are disambiguated one at a time according to the relative plausibility of the highest scoring alignment schema, and the statistical model is re-trained accordingly. On limited evaluation, the learning method achieved an accuracy of 93.28%, representing a slight improvement over a baseline rule-based method.
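
The incremental loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' model: it assumes every grapheme character forms its own segment and receives at least one kana, it scores alignments with a plain relative frequency rather than the paper's TF-IDF adaptation, and all function names (`candidate_alignments`, `train`, `score`, `incremental_align`) are hypothetical.

```python
from collections import Counter
from itertools import combinations


def candidate_alignments(graphemes, phonemes):
    """Yield every split of `phonemes` into len(graphemes) non-empty chunks."""
    n, m = len(graphemes), len(phonemes)
    for cuts in combinations(range(1, m), n - 1):
        bounds = (0, *cuts, m)
        yield tuple(
            (graphemes[i], phonemes[bounds[i]:bounds[i + 1]]) for i in range(n)
        )


def train(pools):
    """Count (grapheme, phoneme) pairs; each entry spreads weight 1 over its candidates."""
    pair_freq, graph_freq = Counter(), Counter()
    for candidates in pools:
        weight = 1.0 / len(candidates)
        for alignment in candidates:
            for g, p in alignment:
                pair_freq[(g, p)] += weight
                graph_freq[g] += weight
    return pair_freq, graph_freq


def score(alignment, pair_freq, graph_freq):
    """Product of relative frequencies (a stand-in for the paper's scoring)."""
    total = 1.0
    for g, p in alignment:
        total *= pair_freq[(g, p)] / graph_freq[g]
    return total


def incremental_align(entries):
    """Resolve one ambiguous entry per iteration, then re-train the counts."""
    pools = [list(candidate_alignments(g, p)) for g, p in entries]
    if any(not candidates for candidates in pools):
        raise ValueError("each grapheme needs at least one phoneme character")
    unresolved = {i for i, c in enumerate(pools) if len(c) > 1}
    while unresolved:
        pair_freq, graph_freq = train(pools)
        best = None
        for i in sorted(unresolved):
            ranked = sorted(
                pools[i],
                key=lambda a: score(a, pair_freq, graph_freq),
                reverse=True,
            )
            top = score(ranked[0], pair_freq, graph_freq)
            runner_up = score(ranked[1], pair_freq, graph_freq)
            plausibility = top / runner_up  # margin of best over second best
            if best is None or plausibility > best[0]:
                best = (plausibility, i, ranked[0])
        _, index, winner = best
        pools[index] = [winner]  # fix this entry; the next pass re-trains on it
        unresolved.discard(index)
    return [pool[0] for pool in pools]
```

With the toy entries `("山", "やま")`, `("川", "かわ")`, `("山川", "やまかわ")` and `("川口", "かわぐち")`, the sketch resolves 山川 first, because its best alignment beats the runner-up by the widest margin, and the re-trained counts then favour 川 = かわ when 川口 is resolved.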

Related resources

A Comparative Study of Unsupervised Grapheme-Phoneme Alignment Methods

This paper describes and compares two unsupervised algorithms to automatically align Japanese grapheme and phoneme strings, identifying segment-level correspondences between them. The first algorithm is inspired by the tf-idf model, including enhancements to handle phonological variation and determine frequency through analysis of “alignment potential”. The second algorithm relies on the C4.5 c...

A Novel Approach to Unsupervised Grapheme–to–phoneme Conversion

Automatic, data-driven grapheme-to-phoneme conversion is a challenging but often necessary task. The top-down strategy implicitly adopted by traditional inductive learning techniques tends to dismiss relevant contexts when they have been seen too infrequently in the training data. This paper proposes instead a bottom-up approach which, by design, exhibits better generalization properties. For e...

Efficient Grapheme-phoneme Alignment for Japanese

Current approaches to the grapheme-phoneme alignment problem for Japanese achieve good accuracy, but are extremely computationally expensive. In this paper we evaluate various modifications to previous algorithms for both the alignment and okurigana detection subtasks. The best algorithm achieved accuracy of 96.2% for the combined task on a limited data set, and was significantly more efficient...

Automated Japanese grapheme-phoneme alignment

This paper describes an adaptation of the tf-idf model to Japanese grapheme-phoneme alignment, without reliance on training data. The tf-idf model is optionally complemented with affixation and conjugation handling modules, and determines frequencies through analysis of “alignment potential”. The proposed system achieved a maximum accuracy of 94.74% on evaluation.

A latent analogy framework for grapheme-to-phoneme conversion

Data-driven grapheme-to-phoneme conversion involves either (top-down) inductive learning or (bottom-up) pronunciation by analogy. As both approaches rely on local context information, they typically require some external linguistic knowledge, e.g., individual grapheme/phoneme correspondences. To avoid such supervision, this paper proposes an alternative solution, dubbed pronunciation by latent ...

Publication date: 1999
